Whole Genome Analysis

100,000 Genomes Project Cancer Programme

Supplementary analysis of somatic variants v1.1

Variants present in the germline are subtracted to produce a list of somatic variants. Accordingly, variants detected in both the germline and the tumour will not be listed in this analysis.
The pathways for sample processing and data analysis are not yet accredited end to end for diagnostic use. Accordingly, any result intended for use in informing clinical management should be confirmed using a test accredited for clinical use.

Participant information

Participant name D.O.B Gender NHS number Laboratory sample ID GeL participant ID GMC Sample date Date analysis issued
XX

Tumour information

Tumour type Tumour subtype ICD10 code Sample type Reported tumour content Tumour sample cross-contamination
Colorectal adenocarcinoma N/A FF Medium 40-60% PASS

Sequencing quality information

See the online Technical Information v1.1.main document and/or the LabKey QC portal for details and expected ranges of QC metrics.

Metric | Germline | Tumour
Mapped reads, % | 95.38 | 95.68
Chimeric DNA fragments, % | 0.40 | 0.37
Insert size median, bp | 482.8 | 447.4
Genome-wide coverage mean, x | 29.3 | 84.6
Unevenness of local genome coverage, x | 6.65 | 13.45
COSMIC content with low coverage (<30x), % | N/A | 1.15
Total somatic SNVs | N/A | 31152
Total somatic indels | N/A | 21184
Total somatic SVs | N/A | 262

PART ONE: DESCRIPTION OF SOMATIC VARIANTS

Circos plot: genome-wide visualisation of somatic variants and sequencing depth

This plot illustrates the distribution of somatic variants across the genome with each concentric circle (track) representing a different class of variant.

Chromosomes are arranged sequentially around the circumference as indicated. The information presented in each track is as follows:

Track 1 (innermost track): chromosomes

Track 2 (in red): number of somatic SNVs per 2 Mb window; scale from 0 to 100

Track 3 (in green): number of somatic indels per 2 Mb window; scale from 0 to 35

Track 4: ratio of normalised depth of coverage for tumour vs normal, on a log2 scale, smoothed over 100 kb windows. Diploid regions have a value of 0. The scale runs from -2 to 2. Regions with coverage below 15x in the germline sample are not shown. CNV losses are indicated in red, CNV gains in green and copy-neutral LOH regions in yellow.

Track 5 (outermost track, in blue): absolute depth of coverage in tumour sample

Structural variants (SVs) are indicated by arcs inside the plot; translocations are indicated in green, inversions are indicated in purple. SVs shorter than 100 kb and insertions are not plotted.
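
As an illustration of the Track 4 calculation, the following minimal Python sketch computes a log2 tumour/normal depth ratio for a single window. It assumes that each depth is normalised by the genome-wide mean coverage of its sample (84.6x for tumour and 29.3x for germline in this report); the exact normalisation and smoothing used to draw the plot are not stated here, and the function name and parameters are hypothetical.

```python
import math

TUMOUR_MEAN_COVERAGE = 84.6    # genome-wide mean from the sequencing quality table
GERMLINE_MEAN_COVERAGE = 29.3  # genome-wide mean from the sequencing quality table
MIN_GERMLINE_DEPTH = 15        # windows below this germline depth are not shown


def log2_depth_ratio(tumour_depth, germline_depth):
    """Return log2 of the normalised tumour/germline depth ratio for one window.

    Returns None when the germline depth is below the display threshold or
    when the tumour depth is zero (the logarithm is undefined there).
    """
    if germline_depth < MIN_GERMLINE_DEPTH or tumour_depth <= 0:
        return None
    tumour_norm = tumour_depth / TUMOUR_MEAN_COVERAGE
    germline_norm = germline_depth / GERMLINE_MEAN_COVERAGE
    return math.log2(tumour_norm / germline_norm)


balanced = log2_depth_ratio(84.6, 29.3)   # depth at the mean in both samples
halved = log2_depth_ratio(42.3, 29.3)     # tumour depth at half its mean
masked = log2_depth_ratio(80.0, 10.0)     # germline depth below 15x
```

Under this assumption a window at the mean depth in both samples gives 0, and a window with half the mean tumour depth gives -1, which would be drawn towards the loss side of the track.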

Small somatic variants detected

Only variants with specific consequences (transcript ablation, splice acceptor variant, splice donor variant, stop gained, frameshift variant, stop lost, initiator codon variant, transcript amplification, inframe insertion, inframe deletion, missense variant, splice region variant, incomplete terminal codon variant) in canonical transcripts are reported. The complete list of canonical transcripts can be accessed at List of canonical transcripts v1.1. Small variants are defined as SNVs and indels shorter than 50 bp. The classification of gene mode of action (oncogene, tumour suppressor or both) was extracted from the manually curated list of Cancer Census Genes (see below).
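
The reporting rule above can be sketched as a simple filter. This is a minimal Python illustration, not the pipeline's implementation: the underscore-separated consequence names are assumed to correspond to the terms listed above, the record layout is hypothetical, and the canonical transcript set contains only a few transcripts taken from the tables in this report.

```python
REPORTED_CONSEQUENCES = {
    "transcript_ablation", "splice_acceptor_variant", "splice_donor_variant",
    "stop_gained", "frameshift_variant", "stop_lost", "initiator_codon_variant",
    "transcript_amplification", "inframe_insertion", "inframe_deletion",
    "missense_variant", "splice_region_variant",
    "incomplete_terminal_codon_variant",
}

# Illustrative subset only; the full list is in "List of canonical transcripts v1.1".
CANONICAL_TRANSCRIPTS = {"ENST00000256078", "ENST00000508376", "ENST00000262367"}

MAX_SMALL_VARIANT_LENGTH = 50  # indels must be shorter than 50 bp


def is_reported(variant):
    """Return True if a variant record passes the reporting rule sketched here."""
    length_change = abs(len(variant["alt"]) - len(variant["ref"]))
    return (
        variant["transcript"] in CANONICAL_TRANSCRIPTS
        and variant["consequence"] in REPORTED_CONSEQUENCES
        and length_change < MAX_SMALL_VARIANT_LENGTH
    )


variants = [
    {"gene": "KRAS", "transcript": "ENST00000256078", "ref": "C", "alt": "T",
     "consequence": "missense_variant"},
    {"gene": "APC", "transcript": "ENST00000508376", "ref": "TGCAA", "alt": "T",
     "consequence": "frameshift_variant"},
    # Hypothetical record, added only to show an excluded consequence.
    {"gene": "KRAS", "transcript": "ENST00000256078", "ref": "G", "alt": "A",
     "consequence": "synonymous_variant"},
]

reported_genes = [v["gene"] for v in variants if is_reported(v)]
```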

Reported variants are classified into three domains:

Domain 1 variants
Variants in a virtual panel of potentially actionable genes*. Actionable genes are defined as genes in which small variants (SNVs and indels <50bp) have reported therapeutic, prognostic or clinical trial associations**, as defined by the GenomOncology Knowledge Management System. Where known, the “variant-level actionability” category and applicable tumour type are indicated. For other variants in these genes, their impact on gene function has not yet been characterised and therefore their actionability status is unclear. This means:
(i) local evaluation will be required for listed variants which are not yet characterised (i.e. “variant-level actionability” is denoted N/A)
(ii) even if well characterised as actionable for some tumour types, the listed variants may not be actionable in the participant’s specific tumour type

*Current potentially actionable genes for solid tumours: 77 genes, listed at Actionable genes in solid tumour v1.1 document
**Links are provided to clinical trials within the United Kingdom which are either actively recruiting participants or closed to recruitment.

Gene | Gene-level actionability | GRCh38 coordinates, ref/alt allele | Transcript | cDNA and protein change | Predicted consequences | Population germline allele frequency (1KG) | VAF | Alt allele/total read depth | COSMIC ID | Variant-level actionability | Gene mode of action
ALK | Therapeutic (NSC lung ca); Trial (NSC lung ca); Trial (NSC lung ca); Trial (NSC lung ca); Trial (NSC lung ca); Trial (NSC lung ca); Trial (solid neoplasm); Trial (solid neoplasm); Trial (solid neoplasm) | 2:29220810 G>A | ENST00000389048 | c.3541C>T, p.(Arg1181Cys) | missense_variant | N/A | 0.15 | 20/130 | N/A | Trial (NSC lung ca); Trial (NSC lung ca); Trial (NSC lung ca); Trial (NSC lung ca); Trial (NSC lung ca); Trial (solid neoplasm); Trial (solid neoplasm); Trial (solid neoplasm) | oncogene
KRAS | Therapeutic (colorectal ca); Therapeutic (NSC lung ca); Trial (colorectal ca); Trial (colorectal ca); Trial (NSC lung ca); Trial (NSC lung ca); Trial (NSC lung ca); Trial (NSC lung ca); Trial (solid neoplasm); Trial (solid neoplasm); Trial (solid neoplasm); Trial (solid neoplasm); Trial (glioma); Trial (MPNST); Trial (melanoma); Trial (neuroblastoma); Trial (rhabdoid tu); Trial (rhabdomyosarcoma); Trial (schwannoma); Trial (sarcoma-ST) | 12:25225628 C>T | ENST00000256078 | c.436G>A, p.(Ala146Thr) | missense_variant | N/A | 0.3 | 35/118 | COSM19404, COSM1165198 | Therapeutic (colorectal ca); Therapeutic (NSC lung ca); Trial (colorectal ca); Trial (NSC lung ca); Trial (colorectal ca); Trial (glioma); Trial (MPNST); Trial (melanoma); Trial (neuroblastoma); Trial (NSC lung ca); Trial (rhabdoid tu); Trial (rhabdomyosarcoma); Trial (schwannoma); Trial (sarcoma-ST); Trial (solid neoplasm); Trial (solid neoplasm); Trial (solid neoplasm) | oncogene
TP53 | Trial (ovarian ca) | 17:7673796 C>A | ENST00000269305 | c.824G>T, p.(Cys275Phe) | missense_variant | N/A | 0.49 | 40/82 | COSM10701, COSM99932, COSM3723938, COSM1637959 | Trial (ovarian ca) | oncogene, tumour suppressor

Domain 2 variants
Variants in a virtual panel of cancer-related genes***. Cancer-related genes are defined as genes in which any variants have been causally implicated in cancer, as defined by the Cancer Gene Census (Wellcome Trust Sanger Institute).

***Current cancer-related genes: 590 genes, listed at Cancer census genes v1.1 document

Gene | GRCh38 coordinates, ref/alt allele | Transcript | cDNA and protein change | Predicted consequences | Population germline allele frequency (1KG) | VAF | Alt allele/total read depth | COSMIC ID | Gene mode of action
APC | 5:112839499 TGCAA>T | ENST00000508376 | c.3906_3909delGCAA, p.(Leu1302>fs) | frameshift_variant | N/A | 0.13 | 14/110 | N/A | tumour suppressor
CREBBP | 16:3758051 T>TA | ENST00000262367 | c.3370-4dupT | splice_region_variant | N/A | 0.2 | 12/61 | N/A | oncogene, tumour suppressor
FAT1 | 4:186620732 C>T | ENST00000441802 | c.5854G>A, p.(Val1952Ile) | missense_variant | N/A | 0.28 | 28/99 | COSM1054196, COSM1054194 | tumour suppressor
FIP1L1 | 4:53453080 CAG>C | ENST00000337488 | c.1459_1460delAG, p.(Arg483>fs) | frameshift_variant | N/A | 0.11 | 9/83 | COSM249696, COSM4435275 | N/A
IDH2 | 15:90088607 T>C | ENST00000330062 | c.514A>G, p.(Arg172Gly) | missense_variant | N/A | 0.24 | 27/112 | COSM33731 | oncogene

Domain 3 variants
Small variants in genes not included in domains 1 & 2.

Structural variants detected

For details of the algorithms used to call CNVs and SVs please refer to Technical Information v1.1.main
Copy number variants (CNVs)
Only CNVs overlapping with introns or exons are listed in the table below
Structural variants (SVs)
Only SVs with breakends within an intron or an exon are listed in the table below. Each row corresponds to one structural variant with two breakends. The types of structural variant represented are: BND = translocation, DEL = deletion, DUP = duplication, INV = inversion, INS = insertion. For translocation events, the coordinate of the second breakend gives the replacement string, position and direction as defined in the Variant Call Format (VCF) specification v4.3.
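
For readers unfamiliar with the VCF breakend notation, the following Python sketch shows one way such a second-breakend string could be decoded. It is an illustration under stated assumptions, not part of the reporting pipeline: the function name is hypothetical, the example strings are invented rather than taken from this report, and the simple pattern does not handle contig names that contain colons or angle brackets.

```python
import re

# Matches the four VCF breakend forms: t[p[, t]p], ]p]t and [p[t
BND_PATTERN = re.compile(
    r"^(?P<before>[ACGTNacgtn.]*)"
    r"(?P<bracket>[\[\]])"
    r"(?P<chrom>[^:\[\]]+):(?P<pos>\d+)"
    r"(?P=bracket)"
    r"(?P<after>[ACGTNacgtn.]*)$"
)


def parse_breakend(alt):
    """Split a VCF breakend ALT string into mate position, string and direction."""
    match = BND_PATTERN.match(alt)
    if match is None:
        raise ValueError(f"not a breakend ALT string: {alt!r}")
    joined_after = bool(match.group("before"))   # mate piece follows the bases
    bracket = match.group("bracket")
    return {
        "mate_chrom": match.group("chrom"),
        "mate_pos": int(match.group("pos")),
        "replacement": match.group("before") or match.group("after"),
        # '[' means the joined piece extends to the right of the mate position,
        # ']' means it extends to the left.
        "mate_extends": "right" if bracket == "[" else "left",
        "joined_after": joined_after,
        # Forms t]p] and [p[t join the reverse complement of the mate piece.
        "reverse_complemented": joined_after == (bracket == "]"),
    }


example = parse_breakend("G]17:198982]")   # invented example string
mirror = parse_breakend("]13:123456]T")    # invented example string
```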

PART TWO: ANALYSIS OF SOMATIC VARIANTS

Somatic mutation prevalence (global mutation burden)

Total number of somatic SNVs: 31152

Total number of somatic SNVs per megabase: 10.12

Total number of somatic non-synonymous SNVs per megabase (coding region): 12.13

Further analyses are under development (pending accrual of sufficient WGS data) for tumour type specific mutational burden comparison
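
The per-megabase figure is a simple rate: the number of variants divided by the size of the analysed region in megabases. The short Python sketch below shows the arithmetic and back-calculates the genome-wide denominator implied by the two reported values. The denominator actually used by the pipeline is not stated in this report, so the implied value is an inference from rounded figures, and the function name is hypothetical.

```python
def mutations_per_megabase(n_mutations, region_bases):
    """Return the number of mutations per megabase of analysed sequence."""
    return n_mutations / (region_bases / 1_000_000)


total_snvs = 31152        # total number of somatic SNVs reported above
reported_rate = 10.12     # somatic SNVs per megabase reported above

# Denominator implied by the reported values, in megabases.
implied_megabases = total_snvs / reported_rate

# Feeding the implied denominator back in reproduces the reported rate.
recomputed_rate = mutations_per_megabase(total_snvs, implied_megabases * 1_000_000)
```

The implied denominator is roughly 3,078 Mb, which is consistent with a rate computed over approximately the whole genome rather than the coding region alone.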

Contextual analysis of somatic SNVs

The following histogram represents the contextual frequency of each type of base substitution. The counts of each mutation type (i.e. base substitution) in each mutation context (i.e. the bases situated immediately 5’ and 3’ to the mutated nucleotide) are corrected for the frequency of each trinucleotide in the reference genome. All substitutions are referred to by the pyrimidine of the mutated base pair. Mutation types are given on the horizontal axis, while the percentage of mutations attributed to each mutation type is given on the vertical axis.
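
The pyrimidine convention and the trinucleotide correction can be sketched in a few lines of Python. This is an illustration only: the function names, the example substitutions and the trinucleotide frequencies are hypothetical, and dividing each count by its trinucleotide frequency before rescaling to percentages is one plausible reading of the correction described above.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pyrimidine_context(five_prime, ref, alt, three_prime):
    """Label a substitution by the pyrimidine of the mutated base pair."""
    if ref in ("C", "T"):
        return f"{five_prime}[{ref}>{alt}]{three_prime}"
    # Purine reference base: describe the change on the opposite strand,
    # which swaps and complements the flanking bases.
    return (f"{COMPLEMENT[three_prime]}[{COMPLEMENT[ref]}>{COMPLEMENT[alt]}]"
            f"{COMPLEMENT[five_prime]}")


def corrected_percentages(contexts, trinucleotide_frequency):
    """Divide counts by trinucleotide frequency, then rescale to percentages."""
    counts = Counter(contexts)
    corrected = {
        # "G[C>T]T" refers to the reference trinucleotide "GCT"
        label: n / trinucleotide_frequency[label[0] + label[2] + label[-1]]
        for label, n in counts.items()
    }
    total = sum(corrected.values())
    return {label: 100 * value / total for label, value in corrected.items()}


label = pyrimidine_context("A", "G", "A", "C")   # G>A within 5'-AGC-3'
percentages = corrected_percentages(
    ["G[C>T]T", "G[C>T]T", "A[C>A]A"],
    {"GCT": 0.02, "ACA": 0.01},                  # hypothetical frequencies
)
```

Here the G>A change in the context AGC is relabelled G[C>T]T, because the same event read on the opposite strand is a C>T change flanked by G and T.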

Mutational signature analysis

The following bar plot displays the relative proportions of the different mutational signatures demonstrated by the tumour. Analysis of large sequencing datasets (10,952 exomes and 1,048 whole-genomes from 40 distinct tumour types) has allowed patterns of relative contextual frequencies of different SNVs to be grouped into specific mutational signatures. Using mathematical methods (decomposition by non-negative least squares) the contribution of each of these signatures to the overall mutation burden observed in a tumour can be derived. Further details of the 30 different mutational signatures used for this analysis, their prevalence in different tumour types and proposed aetiology can be found at the Sanger Institute Website.
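
To make the decomposition step concrete, the following Python sketch fits non-negative signature contributions to an observed mutation profile. It is a toy example under several assumptions: it uses two invented signatures over three mutation channels instead of the 30 signatures over the full set of contexts, and it solves the non-negative least squares problem with a simple projected gradient descent, which is a stand-in chosen for brevity and not necessarily the solver used for this report.

```python
def nnls_projected_gradient(signatures, observed, iterations=500):
    """Minimise ||sum_j x_j * signature_j - observed||^2 subject to x_j >= 0."""
    n_sigs, n_channels = len(signatures), len(observed)
    # The sum of squared entries bounds the largest eigenvalue of S^T S,
    # so its reciprocal is a safe fixed step size.
    step = 1.0 / sum(value * value for sig in signatures for value in sig)
    x = [0.0] * n_sigs
    for _ in range(iterations):
        fitted = [sum(x[j] * signatures[j][c] for j in range(n_sigs))
                  for c in range(n_channels)]
        residual = [fitted[c] - observed[c] for c in range(n_channels)]
        gradient = [sum(signatures[j][c] * residual[c] for c in range(n_channels))
                    for j in range(n_sigs)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * gradient[j]) for j in range(n_sigs)]
    return x


# Invented signatures: each is a probability profile over three channels.
toy_signatures = [
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
]
# Observed profile constructed as 70% of the first plus 30% of the second.
toy_observed = [0.45, 0.27, 0.28]

exposures = nnls_projected_gradient(toy_signatures, toy_observed)
total = sum(exposures)
proportions = [value / total for value in exposures]
```

Because the toy profile was built as an exact mixture, the fitted proportions recover the 70%/30% split; with real data the fit is approximate and a residual remains.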

Analysis of clusters of somatic SNVs (rain plot)

This plot presents SNVs in positional order (from the first variant on the short arm of chromosome 1 to the last variant on the long arm of chromosome Y) on the X-axis, with the distance between consecutive SNVs indicated on a logarithmic scale on the Y-axis. The colour of each dot indicates the type of substitution (see legend). Regions of localised hypermutation can be observed as clusters of SNVs that have a lower inter-mutation distance and show similar base changes.
Putative regions of hypermutation are indicated by black arrows on this plot and are defined as regions containing six or more consecutive mutations with an average inter-mutation distance of less than or equal to 1,000 bp.
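
The definition of a putative hypermutated region can be sketched as a sliding-window scan. The Python code below is one possible interpretation, not the method used to produce the plot: it examines every run of six consecutive SNVs on a single chromosome, flags runs whose mean inter-mutation distance is at most 1,000 bp and merges overlapping flagged runs. The function name and the example positions are hypothetical.

```python
def hypermutated_regions(positions, min_mutations=6, max_mean_distance=1000):
    """Return (start, end) regions for SNV positions on a single chromosome."""
    positions = sorted(positions)
    regions = []
    for i in range(len(positions) - min_mutations + 1):
        start = positions[i]
        end = positions[i + min_mutations - 1]
        # The mean of consecutive distances in a window equals its span
        # divided by the number of gaps.
        mean_distance = (end - start) / (min_mutations - 1)
        if mean_distance > max_mean_distance:
            continue
        if regions and start <= regions[-1][1]:
            # Overlaps the previous flagged window: extend that region.
            regions[-1] = (regions[-1][0], end)
        else:
            regions.append((start, end))
    return regions


# Invented positions: a tight cluster followed by two isolated SNVs.
example_positions = [100, 300, 500, 700, 900, 1100, 50000, 120000]
clusters = hypermutated_regions(example_positions)
```

In this example only the first six positions form a region, spanning 100 to 1100, because any window that includes the isolated SNVs has a mean distance far above 1,000 bp.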

The table below indicates the genomic coordinates of individual regions of hypermutation.

Analysis of variant allele frequency (VAF) of small somatic variants and indel lengths

The following histograms show VAF and length distributions for small somatic variants.

VAF is calculated as alt/(alt + ref) where alt and ref are the number of reads passing filter (see Technical Information v1.1.main) supporting the non-reference and reference base respectively. VAF depends on tumour purity, cancer heterogeneity and copy number variants.

The negative values on the length distribution plot correspond to deletions and the positive values correspond to insertions.
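
Both quantities can be reproduced from values in the variant tables. The Python sketch below is a minimal illustration with hypothetical function names; it assumes that the total read depth shown in the tables equals alt + ref, so the reference read count is obtained by subtraction.

```python
def variant_allele_frequency(alt_reads, ref_reads):
    """VAF = alt / (alt + ref), using reads that pass filter."""
    return alt_reads / (alt_reads + ref_reads)


def indel_length(ref_allele, alt_allele):
    """Signed length: negative for deletions, positive for insertions."""
    return len(alt_allele) - len(ref_allele)


# KRAS c.436G>A: 35 alt reads out of 118 total, so 83 reference reads
# (assuming total depth = alt + ref).
kras_vaf = round(variant_allele_frequency(35, 118 - 35), 2)

apc_length = indel_length("TGCAA", "T")    # APC 5:112839499 TGCAA>T
crebbp_length = indel_length("T", "TA")    # CREBBP 16:3758051 T>TA
```

With these inputs the KRAS VAF rounds to 0.3 as listed in the Domain 1 table, the APC deletion has length -4 and the CREBBP insertion has length +1.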

Additional Information

Genomics England
Queen Mary University of London
Dawson Hall
Charterhouse Square
London
EC1M 6BQ

Sequencing Laboratory
Illumina Laboratory Services United Kingdom - Hinxton
The Ogilvie Building, Wellcome Trust Genome Campus
Hinxton Nr Saffron Walden
Essex
CB10 1DR